Stability Evaluation of Clustering Algorithms for Time Series Gene Expression Data
Authors
Abstract
Clustering techniques such as the Self-Organizing Map (SOM) and hierarchical clustering, among others, have been applied to gene expression data. These studies often focus on the biological results and give no indication of which methods are more suitable for clustering gene expression data. This paper presents an evaluation methodology that assesses the stability of clustering methods in relation to external validation criteria. To do so, it investigates the extent to which similar results are obtained from analyses of subsets of the data. Preliminary results are shown for applying this methodology to evaluate the stability of three clustering techniques (SOM, hierarchical clustering, and dynamical clustering) on five gene expression time series data sets from yeast cells.
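The subset-based idea in the abstract can be illustrated with a small sketch. Everything in it is an assumption made for illustration, since the abstract does not give the procedure: the adjusted Rand index stands in for the external validation criterion, a plain k-means with farthest-first initialization stands in for the clustering methods studied, and the function names, the 80% subset size, and the toy profiles are hypothetical.

```python
import random
from collections import Counter
from math import comb
from statistics import mean, pstdev


def adjusted_rand(a, b):
    """Adjusted Rand index between two labelings of the same items."""
    n = len(a)
    sum_ij = sum(comb(c, 2) for c in Counter(zip(a, b)).values())
    sum_a = sum(comb(c, 2) for c in Counter(a).values())
    sum_b = sum(comb(c, 2) for c in Counter(b).values())
    expected = sum_a * sum_b / comb(n, 2)
    max_index = (sum_a + sum_b) / 2
    if max_index == expected:  # degenerate case, e.g. one cluster in both
        return 1.0
    return (sum_ij - expected) / (max_index - expected)


def sq_dist(p, q):
    return sum((x - y) ** 2 for x, y in zip(p, q))


def kmeans(points, k, rng, iters=20):
    """Stand-in clustering method (assumption, not one of the paper's three)."""
    # Farthest-first initialization: random first center, then the point
    # farthest from all centers chosen so far.
    centers = [rng.choice(points)]
    while len(centers) < k:
        centers.append(max(points, key=lambda p: min(sq_dist(p, c) for c in centers)))
    for _ in range(iters):
        labels = [min(range(k), key=lambda j: sq_dist(p, centers[j])) for p in points]
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                centers[j] = tuple(mean(col) for col in zip(*members))
    return labels


def stability(points, classes, cluster_fn, k, n_subsets=10, fraction=0.8, seed=0):
    """Cluster random subsets and score each against the known classes."""
    rng = random.Random(seed)
    size = max(2, int(fraction * len(points)))
    scores = []
    for _ in range(n_subsets):
        idx = rng.sample(range(len(points)), size)
        labels = cluster_fn([points[i] for i in idx], k, rng)
        scores.append(adjusted_rand([classes[i] for i in idx], labels))
    return mean(scores), pstdev(scores)


# Toy expression profiles over four time points (hypothetical data):
# ten rising and ten falling profiles with small baseline shifts.
rising = [(0 + 0.1 * i, 1 + 0.1 * i, 2 + 0.1 * i, 3 + 0.1 * i) for i in range(10)]
falling = [(3 + 0.1 * i, 2 + 0.1 * i, 1 + 0.1 * i, 0 + 0.1 * i) for i in range(10)]
profiles = rising + falling
known_classes = [0] * 10 + [1] * 10

mean_ari, spread = stability(profiles, known_classes, kmeans, k=2)
print(f"mean ARI = {mean_ari:.3f}, spread = {spread:.3f}")
```

Under these assumptions, a method counts as more stable when its agreement with the known classes stays high and varies little from subset to subset. Other clustering functions with the same `(points, k, rng)` signature could be passed in place of `kmeans` for comparison.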
Similar resources
Fuzzy clustering of time series data: A particle swarm optimization approach
With rapid development in information-gathering technologies and access to large amounts of data, methods for analyzing data and extracting useful information from large raw datasets are always required, and data mining is an important method for solving this problem. Clustering analysis, as the most commonly used function of data mining, has attracted many researchers in computer science. Because o...
Forecasting copper price using gene expression programming
Forecasting the prices of metals is important in many aspects of economics. Metal prices are also vital variables in financial models for revenue evaluation, which forms the basis of an effective payment regime using resource policymakers. Given the severe changes in metal prices in recent years, the classic estimation methods cannot correctly estimate the volatility. In order to...
Algorithms for Segmenting Time Series
As with most computer science problems, representation of the data is the key to efficient and effective solutions. Piecewise linear representation has been used for the representation of the data. This representation has been used by various researchers to support clustering, classification, indexing and association rule mining of time series data. A variety of algorithms have been proposed to obtain...
Missing data imputation in multivariable time series data
Multivariate time series data are found in a variety of fields such as bioinformatics, biology, genetics, astronomy, geography and finance. Many time series datasets contain missing data. Multivariate time series missing data imputation is a challenging topic and needs to be carefully considered before learning or predicting time series. Much research has been done on the use of diffe...
Combination of Transformed-means Clustering and Neural Networks for Short-Term Solar Radiation Forecasting
In order to provide an efficient conversion and utilization of solar power, solar radiation data should be measured continuously and accurately over the long-term period. However, the measurement of solar radiation is not available to all countries in the world due to some technical and fiscal limitations. Hence, several studies were proposed in the literature to find mathematical and physical mod...
Journal title:
Volume, issue
Pages -
Publication date: 2002